Differential Intercepts and Interactions

Author: Dave Clark, Binghamton University

Published: March 23, 2025

Differential Intercepts

Goals

The models we’ve considered so far have generally assumed structural stability - that the intercepts and slopes are shared by all groups in the data.

We’ll consider two possibilities:

  • groups in the data have different intercepts - test this with group indicator variables (dummies).

  • groups in the data have different slopes with respect to the same variables - test this using multiplicative interactions.

These are not mutually exclusive - many models employ both.

Indicator Variables

Suppose the following regression:

\[Y=\beta_0+\beta_1(D)+\varepsilon \]

where \(D\) is an indicator variable, and \(Y\) is a continuous variable. Where \(D\) is constructed just to distinguish between two groups, say, \(D=1\) for the US South and \(D=0\) otherwise, we usually call these dummy variables.

Indicator Variables

Now, consider the conditional expected values of \(Y\):

\[E[Y|D=0]= \beta_0 = \bar{Y}|D=0 \]

\[E[Y|D=1]= \beta_0 + \beta_1 = \bar{Y}|D=1 \]

So these are just the conditional means of \(Y\); the conditions are given by the values of \(D\) and represent the subsamples for \(D=0\) and \(D=1\).
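To make this concrete, here's a minimal simulated-data sketch (base R; all names are illustrative) confirming that the dummy-only regression recovers the conditional means exactly:

```r
# minimal check: dummy regression coefficients are conditional means
set.seed(42)
D <- rbinom(500, 1, 0.5)        # group indicator
Y <- 2 + 3 * D + rnorm(500)     # true means: 2 for D=0, 5 for D=1

m  <- lm(Y ~ D)
b0 <- coef(m)[1]                # intercept = mean of Y where D = 0
b1 <- coef(m)[2]                # differential intercept

all.equal(unname(b0), mean(Y[D == 0]))        # TRUE
all.equal(unname(b0 + b1), mean(Y[D == 1]))   # TRUE
```

The identities hold exactly, not just approximately - with a single dummy regressor, OLS reproduces the two subsample means.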

Differential Intercept

The dummy variable estimate represents the differential intercept, the difference between the mean of \(Y|D=1\) and the mean of \(Y|D=0\). The estimate \(\beta_1\) is the difference in height on the \(y\) axis between the two groups in \(D\). The actual intercepts are:

\[E[Y|D=0] = \beta_0\]

\[E[Y|D=1] = \beta_0+\beta_1\]

code
## differential intercepts ----
library(tidyverse)
library(faux)   # rnorm_multi()

set.seed(12345)
data <- rnorm_multi(1000, 3, 
                    mu = c(0, 5, 0), 
                    sd = c(1, 5, .5),
                    r = c(0, 0, 0),
                    varnames = c("x1", "x2", "e")) %>%
  mutate(x2 = ifelse(x2 > median(x2), 1, 0)) %>%  # split x2 at its median into a dummy
  mutate(y = 1 + 1*x1 + 2*x2 + e)

m1 <- lm(y ~ x1 + as.factor(x2), data = data)
  
### intercept differences only  ----
  
data1 <- data %>% mutate(x1=0)  
predint <- data.frame(data, predict(m1, interval="confidence", se.fit=TRUE, newdata = data1))
predint <- predint %>% mutate(ub=fit.fit +1.96*se.fit, lb=fit.fit -1.96*se.fit) #absurd CI for illustration 

ggplot(data=predint, aes(x=x2, y=fit.fit)) +
  geom_point(size = 1) +
  geom_pointrange(aes(ymin=lb, ymax=ub)) +
  scale_x_continuous(breaks = c(0,1)) +
  labs ( colour = NULL, x = "x2", y =  "Predicted xb" ) +
  theme_minimal()+
  ggtitle("Differential Intercepts")  +
  annotate("text", x = .1, y = 1.5, label = "B0" , size=3.5, colour="gray30") +
  annotate("text", x = .9, y = 3.3, label = "B0 + B1", size=3.5, colour="gray30") 

Differential Intercept

Note this logic extends to multiple indicator variables where \(\beta_0\) measures the intercept for the excluded category (all indicators set to zero), and \(\beta_0+\beta_1\) measures the intercept for any other category where its indicator is set to one, all others to zero.

Multiple Indicator Variables

Thinking of the Ill-Treatment models we’ve been working with:

\[ Y=\beta_0+\beta_1(D_{1})+\beta_{2}(D_{2}) + \ldots + \varepsilon \]

where \(D_{1}=1\) indicates a civil war, and \(D_{2}=1\) indicates the government restricts IGO access:

  • \(E[Y]\) for a non civil war, no restriction country is \(\beta_{0}\).

  • \(E[Y]\) for a civil war state with no restrictions is \(\beta_{0}+\beta_{1}\)

  • \(E[Y]\) for a non civil war state with restricted access is \(\beta_{0}+\beta_{2}\)

  • \(E[Y]\) for a civil war state that restricts access is \(\beta_{0}+\beta_{1}+\beta_{2}\)

code
itt <- read.csv("/Users/dave/Documents/teaching/501/2023/exercises/ex4/ITT/data/ITT.csv")

itt <- 
  itt%>% 
  group_by(ccode) %>%
  mutate( lagprotest= lag(protest), lagRA=lag(RstrctAccess), n=1) %>%
  mutate(interact= civilwar*lagRA)


m3 <- lm(scarring ~ lagRA + civilwar  + lagprotest + polity2 +wdi_gdpc+wdi_pop, data=itt)
# summary(m1)
# average predictions, end point boundaries
# estimation sample
itt$used <- TRUE
itt$used[na.action(m3)] <- FALSE
ittesample <- itt %>%  filter(used)

#across 4 combos of RA and CW

predictions <- data.frame(case=seq(1,4,1), xb=0, se=0, ub=0, lb=0, model=0 )

avg <- ittesample %>% mutate(civilwar=0, lagRA=0 )
all0 <- data.frame(predict(m3, interval="confidence", se.fit=TRUE, newdata = avg))
predictions[1:1,2:3]<- data.frame(xb=median(all0$fit.fit, na.rm = TRUE), seall0=median(all0$se.fit,na.rm = TRUE))

avg <- ittesample %>% mutate(civilwar=1, lagRA=0 )
cw <- data.frame(predict(m3, interval="confidence", se.fit=TRUE, newdata = avg))
predictions[2:2,2:3]<- data.frame(xb=median(cw$fit.fit, na.rm = TRUE), seall0=median(cw$se.fit,na.rm = TRUE))

avg <- ittesample %>% mutate(civilwar=0, lagRA=1 )
ra <- data.frame(predict(m3, interval="confidence", se.fit=TRUE, newdata = avg))
predictions[3:3,2:3]<- data.frame(xb=median(ra$fit.fit, na.rm = TRUE), seall0=median(ra$se.fit,na.rm = TRUE))

avg <- ittesample %>% mutate(civilwar=1, lagRA=1 )
all1 <- data.frame(predict(m3, interval="confidence", se.fit=TRUE, newdata = avg))
predictions[4:4,2:3]<- data.frame(xb=median(all1$fit.fit, na.rm = TRUE), seall0=median(all1$se.fit,na.rm = TRUE))

predictions <- predictions %>% mutate(ub=xb +1.96*se , lb =xb -1.96*se)


#plot
ggplot(data=predictions, aes(x=case, y=xb)) +
  geom_pointrange(data=predictions, aes(ymin=lb, ymax=ub)) +
  labs ( colour = NULL, x = "", y =  "Expected Scarring Torture Reports" ) +
  guides(x="none")+
  annotate("text", x = 1.6, y = 5, label = "No restriction, no civil war", size=3.5, colour="gray30")+
  annotate("text", x = 2, y = 9.5, label = "No restriction, civil war", size=3.5, colour="gray30")+
  annotate("text", x = 3, y = 13, label = "Restriction, no civil war", size=3.5, colour="gray30")+
  annotate("text", x = 3.2, y = 22, label = "Restriction and civil war", size=3.5, colour="gray30")+
  theme_minimal()

Fixed Effects

Let’s take this notion of differential intercepts one final step. Suppose we estimate a model of Ill-Treatment and include a dummy variable for each country (minus one for the excluded category).

code
# fixed effects ----
itt$c <- as.factor(itt$ctryname)
itt <- itt %>% filter(!is.na(c))
# set ref category to US
# itt <- within(itt, c <- relevel(c, ref = "United States of America"))

# be sure NA and "" are not categories in factor var c
m4 <- lm(scarring ~ c, data=itt%>%filter(c!=""))
#modelsummary(m4)

library(stargazer)
stargazer(m4, type="html",  single.row=TRUE, header=FALSE, digits=3,  omit.stat=c("LL","ser"),  star.cutoffs=c(0.05,0.01,0.001),  column.labels=c("Fixed Effects"),  dep.var.caption="Dependent Variable: Scarring Torture", dep.var.labels.include=FALSE, notes=c("Standard errors in parentheses", "Significance levels:  *** p<0.001, ** p<0.01, * p<0.05"), notes.append = FALSE,  align=TRUE,  font.size="small")
Dependent Variable: Scarring Torture
Fixed Effects
cAlbania 4.717 (3.487)
cAlgeria -3.283 (3.487)
cAngola 6.444 (3.487)
cArgentina 3.990 (3.487)
cArmenia 4.644 (3.565)
cAustralia -4.156 (3.565)
cAustria -0.556 (3.565)
cAzerbaijan 3.263 (3.487)
cBangladesh 1.717 (3.487)
cBelarus 0.990 (3.487)
cBelgium 1.172 (3.487)
cBenin -4.556 (6.065)
cBolivia 0.778 (3.658)
cBosnia and Herzegovina -4.306 (4.662)
cBrazil 18.626*** (3.487)
cBulgaria 9.626** (3.487)
cBurkina Faso -3.413 (3.910)
cBurundi 1.444 (3.487)
cCambodia -3.656 (3.565)
cCameroon -0.456 (3.565)
cCanada -4.181 (3.770)
cCentral African Republic 5.778 (5.173)
cChad -0.856 (3.565)
cChile -0.856 (3.565)
cChina 28.354*** (3.487)
cColombia 5.172 (3.487)
cCongo (Brazzaville, Republic of Congo) -2.356 (3.565)
cCosta Rica -3.889 (5.173)
cCote d’Ivoire -2.374 (3.487)
cCroatia 0.144 (3.565)
cCuba 2.354 (3.487)
cCzech Republic -0.756 (3.565)
cDemocratic Republic of the Congo (Zaire, Congo-Kinshasha) 6.778 (4.089)
cDenmark -5.056 (3.565)
cDominican Republic -4.000 (3.658)
cEast Timor -2.556 (5.173)
cEcuador -0.828 (3.487)
cEgypt 5.354 (3.487)
cEl Salvador -4.222 (4.089)
cEritrea -3.681 (3.770)
cEstonia -4.806 (4.662)
cEthiopia -3.181 (3.770)
cFinland -5.556 (5.173)
cFrance 2.535 (3.487)
cGabon -4.889 (5.173)
cGambia -3.444 (3.658)
cGeorgia 4.990 (3.487)
cGermany 9.172** (3.487)
cGhana -3.270 (3.910)
cGreece 4.899 (3.487)
cGuatemala -3.283 (3.487)
cGuinea -1.556 (3.910)
cGuinea-Bissau 1.844 (4.328)
cGuyana -3.681 (3.770)
cHaiti -1.556 (3.658)
cHonduras -3.856 (3.565)
cHungary 2.744 (3.565)
cIndia 4.899 (3.487)
cIndonesia 27.263*** (3.487)
cIran 0.535 (3.487)
cIraq -2.756 (3.565)
cIreland -4.556 (6.065)
cIsrael 5.444 (3.487)
cItaly 11.172** (3.487)
cJamaica -1.919 (3.487)
cJapan -3.646 (3.487)
cJordan -1.010 (3.487)
cKazakhstan -1.646 (3.487)
cKenya 8.535* (3.487)
cKuwait -5.306 (3.770)
cKyrgyzstan -3.889 (3.658)
cLaos -2.737 (3.487)
cLatvia -4.889 (5.173)
cLebanon 0.244 (4.328)
cLesotho -3.556 (3.770)
cLiberia 1.819 (3.770)
cLibyan Arab Jamahiriya -3.374 (3.487)
cLithuania -4.556 (6.065)
cMacedonia 4.778 (3.658)
cMadagascar -0.556 (6.065)
cMalawi -4.889 (4.089)
cMalaysia 0.899 (3.487)
cMali -4.556 (8.178)
cMauritania -2.919 (3.487)
cMauritius -3.556 (4.662)
cMexico 15.626*** (3.487)
cMoldova -3.101 (3.487)
cMongolia -4.556 (6.065)
cMorocco -1.737 (3.487)
cMozambique -1.556 (3.565)
cMyanmar 15.626*** (3.487)
cNamibia -1.698 (3.910)
cNepal 7.263* (3.487)
cNetherlands -4.556 (4.328)
cNicaragua -4.389 (4.089)
cNiger -2.056 (4.662)
cNigeria 1.717 (3.487)
cNorth Korea -3.101 (3.487)
cNorway -5.556 (6.065)
cOman -4.756 (4.328)
cPakistan 1.081 (3.487)
cPanama -5.056 (4.662)
cPapua New Guinea -2.667 (3.658)
cParaguay -2.465 (3.487)
cPeru 1.263 (3.487)
cPhilippines 2.808 (3.487)
cPoland -3.756 (3.565)
cPortugal 4.354 (3.487)
cQatar -4.931 (3.770)
cRomania 12.844*** (3.565)
cRussian Federation 23.717*** (3.487)
cRwanda 6.808 (3.487)
cSaudi Arabia -0.828 (3.487)
cSenegal 2.444 (3.658)
cSerbia and Montenegro -5.556 (8.178)
cSierra Leone -1.841 (3.910)
cSingapore -5.056 (4.662)
cSlovakia 0.111 (3.658)
cSlovenia -3.956 (4.328)
cSouth Africa -0.010 (3.487)
cSouth Korea -2.222 (3.658)
cSpain 7.172* (3.487)
cSudan 10.354** (3.487)
cSuriname -4.556 (8.178)
cSwaziland -2.181 (3.770)
cSweden -3.465 (3.487)
cSwitzerland 3.944 (3.565)
cSyria 5.808 (3.487)
cTajikistan -2.101 (3.487)
cTanzania 1.544 (3.565)
cThailand -3.756 (3.565)
cTogo 1.626 (3.487)
cTrinidad and Tobago -2.306 (3.770)
cTunisia 3.081 (3.487)
cTurkey 54.172*** (3.487)
cTurkmenistan 1.081 (3.487)
cUganda -0.556 (3.487)
cUkraine -0.737 (3.487)
cUnited Arab Emirates -4.681 (3.770)
cUnited Kingdom 2.263 (3.487)
cUnited States of America 18.899*** (3.487)
cUruguay -5.222 (3.658)
cUzbekistan 14.354*** (3.487)
cVenezuela 2.944 (3.565)
cViet Nam -4.556 (3.487)
cYemen -1.465 (3.487)
cZambia -1.828 (3.487)
cZimbabwe 3.717 (3.487)
Constant 5.556* (2.586)
Observations 1,317
R2 0.550
Adjusted R2 0.493
F Statistic 9.648*** (df = 148; 1168)
Note: Standard errors in parentheses
Significance levels: *** p<0.001, ** p<0.01, * p<0.05

Fixed Effects Plot

Here’s a plot of the fixed effect coefficients - each is a differential intercept, measuring the difference from the reference category.

code
# reference category is AFG by default 
# coefficients as data frame for plotting
coefs<- data.frame(coef(summary(m4)))
coefs$country <- rownames(coefs)
coefs$country <- substr(coefs$country, 2, nchar(coefs$country))
coefs$country[coefs$country=="Intercept)"] <- "Intercept"
coefs$sig <-ifelse(coefs$`Pr...t..`<.05, 1,0)


## plot fixed effects ----
library(ggrepel)   # geom_text_repel()

p <- ggplot(coefs, aes(x = Estimate, y = country, color = factor(sig), label = country)) +
  geom_point(size = 1) +
  geom_errorbarh(aes(xmin = Estimate - 1.96 * Std..Error, xmax = Estimate + 1.96 * Std..Error)) +
  geom_text_repel(data=coefs%>%filter(sig==1),  size=2.5, force=5) + 
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank())+
  labs ( colour = NULL, y = "", x =  "Expected Scarring Torture Reports" ) +
  ggtitle("Fixed Effects", subtitle = "Country Coefficients")

p + scale_color_manual(values = c("red", "black")) +
  guides(color="none")

The result is that each coefficient is an intercept for a particular country. Each measures the difference in that country’s intercept (or mean of \(y\)) from the excluded category, in this case, Afghanistan. Black bars are different from the excluded category at the .05 level.

Each country coefficient plus the intercept is that country’s mean of \(y\), scarring torture. Looking, for instance, at the US, the coefficient is about 19, so the US mean of scarring torture reports is 19 plus the intercept (5.5), about 24.5 (the actual mean for US scarring torture reports is 24.45).
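That identity is easy to verify on simulated data - a hypothetical sketch with three "countries":

```r
# sketch: fixed-effect coefficient + intercept = that group's mean of y
set.seed(99)
country <- factor(rep(c("A", "B", "C"), each = 200))
y <- c(A = 1, B = 4, C = 7)[country] + rnorm(600)   # group means differ

fe <- lm(y ~ country)   # "A" is the excluded (reference) category
b  <- coef(fe)

all.equal(unname(b["(Intercept)"]), mean(y[country == "A"]))                  # TRUE
all.equal(unname(b["(Intercept)"] + b["countryB"]), mean(y[country == "B"]))  # TRUE
```

With only group dummies on the right-hand side, the intercept is the reference group's mean and each dummy shifts to the corresponding group mean.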

Intercept Shifts with Continuous Variables

Suppose we have a regression with a dummy variable and a continuous variable:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 \] The expected value of \(y\) for each group is:

\[E[y | d_1= 0] = \beta_0 + \beta_2 x_2 \]

\[E[y | d_1= 1] = \beta_0 + \beta_1 + \beta_2 x_2 \]

Note the shift in intercept.

code
ggplot(data=filter(data, x2==0), aes(x=x1, y=y)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_smooth(data=filter(data, x2==1), aes(x=x1, y=y), method="lm", se=FALSE, color="black") +
  labs ( colour = NULL, x = "x1", y =  "Predicted xb" ) +
  theme_minimal()+
  annotate("text", x = 0, y = 0, label = "x2=0", size=3.5, colour="gray30")+
  annotate("text", x = 0, y = 4, label = "x2=1", size=3.5, colour="gray30")+
  ggtitle("Differential Intercepts, Continuous x1")

The difference is only in the intercepts, \(\beta_{1}\) on \(d_1\) - the slope estimate for \(x_2\) is the same no matter the value of \(d_{1}\).

Put differently, these two groups given by the dummy variable share the same slope, but have different y-intercepts.

Structural Stability

In this regression, the slope on \(x_2\) is the same for all groups measured by indicator variables.

\[y=\beta_0 + \beta_1 d_1 +\beta_2 x_2+ \varepsilon \]

Assuming a common slope is known as the structural stability assumption.

Dummy Variables

  • Dummy variables are useful for measuring differences between groups.

  • Dummy coefficients specifically measure the differences in levels (intercepts) between groups.

  • Dummies may capture differences between known, discrete groups, e.g. genders, parties, races, etc.

  • Dummies might capture unknown differences between units, say between states or countries - this is the foundation of fixed effects.

Questioning Structural Stability

We may well have reason to doubt structural stability. In other words, we might think \(y\) increases quickly in \(x\) for one group, but more slowly for the other group.

So we have this regression:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 \]

But we think

\[y = \beta_0 + .5 x_2 ~~ \text{if } d_1=0\]

and

\[y = \beta_0 + \beta_1 + 1.7 x_2~~ \text{if } d_1=1 \]

For instance, we might suppose the effect of protests on scarring torture is different in states that restrict IGO access than in states that do not. States that do not restrict access may engage in more torture as protests increase, but more slowly than states that do restrict access and so do not have to hide their activities. So restricting access may accelerate the effect of protests on torture, whereas leaving IGO access unrestricted may moderate that effect.

Structural Stability

If this is the case, then the marginal effect of \(x_2\) is not \(\beta_2\); instead, it is .5 for one group, and 1.7 for the other.

Recall, the marginal effect of \(x_2\) in the model where we assume structural stability:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 \]

is \(\beta_2\).

Illustration

code
data <- data %>% mutate(interaction=x1*x2) %>%
  mutate(yi = 1 + 1*x1 + 2*x2 + 1.5*interaction + e)
m1 <- (lm(y ~ x1 + x2, data=data))
m2 <- (lm(yi ~ x1 + x2 + x1:x2, data=data))
#summary(m2)  

#using sjPlot for effect plots; patchwork stacks p1/p2 below
library(sjPlot)
library(patchwork)
p1 <- plot_model(m1, type="eff", terms=c("x1", "x2"), show.values = TRUE, show.p = TRUE, title="Differential Intercept") +
  labs ( colour = NULL, x = "x1", y =  "Predicted xb" ) +
  guides(colour="none") +
  annotate("text", x = 0, y = -1, label = "x2=0", size=2.5, colour="gray30")+
  annotate("text", x = 0, y = 5, label = "x2=1", size=2.5, colour="gray30")+ theme_minimal()
  
p2 <-  plot_model(m2, type="eff", terms=c("x1", "x2"), show.values = TRUE, show.p = TRUE, title="Interaction, x1*x2") +
  labs ( colour = NULL, x = "x1", y =  "Predicted xb" ) +
  guides(colour="none") +
  annotate("text", x = 0, y = -2, label = "x2=0", size=2.5, colour="gray30")+
  annotate("text", x = 0, y = 5, label = "x2=1", size=2.5, colour="gray30")+
  theme_minimal()

p1/p2

In the first panel, we see the differential intercepts - the slope on \(x_1\) is the same for both groups, but the intercepts are different. This is structural stability. The bottom panel relaxes the structural stability assumption, allowing the slope on \(x_1\) to differ between the two groups.

Structural Stability

We relax the structural stability assumption by modeling a multiplicative interaction:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 + \beta_3 (d_1x_2)\]

literally multiplying \(d_1\) and \(x_2\) together, and including that new variable in the model. \(\beta_3\) measures the difference in slope between the two groups in \(d_1\).
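As a quick illustration of what "literally multiplying" means in practice, here's a small simulated sketch showing that a hand-built product variable and R's formula interface (`d1 * x2`) produce identical fits:

```r
# d1*x2 by hand vs. the formula interface: identical models
set.seed(3)
d1 <- rbinom(400, 1, 0.5)
x2 <- rnorm(400)
y  <- 1 + d1 + 0.5 * x2 + 2 * d1 * x2 + rnorm(400)

dat <- data.frame(y, d1, x2, dx = d1 * x2)    # hand-built interaction term
m_hand    <- lm(y ~ d1 + x2 + dx, data = dat)
m_formula <- lm(y ~ d1 * x2, data = dat)      # expands to d1 + x2 + d1:x2

all.equal(unname(coef(m_hand)), unname(coef(m_formula)))   # TRUE
```

Either way, both constituent terms and the product enter the model; the formula interface just builds the product column for you.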

Multiplicative Interactions

Let’s work with the ITT data for the sake of continuity. The basic model we’re working with predicts reports of scarring torture. In general, my argument is that scarring torture is potentially a tool regimes use to signal to dissidents they should watch themselves. Insofar as violent repression is costly, regimes will use scarring torture to reduce dissent, thereby making violent repression less necessary. So when protest frequency is higher, we should see more scarring torture as regimes attempt to quell dissent.1

Regime transparency will modify this relationship. Regimes that restrict access to IGOs will have lower costs for using violence (scarring torture), so should employ it at a higher rate than will regimes that do not restrict IGO access. This implies the following hypothesis:

\(H_1\): Protests will be positively related to scarring torture; their effect will be stronger in states that restrict IGO access than in states that do not.

Alternatively, we can state it this way:

\(H_1\): Increases in protests will produce faster increases in scarring torture in states that restrict IGO access than in states that do not.

You should note this implies two slopes - one for states that restrict IGO access, and one for states that do not. The slope effect for protests should be larger in states that restrict access.

Since we’re positing different slopes, we’re explicitly relaxing the structural stability assumption.

Structural Stability

In this regression:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 \]

The slope on \(x_2\) is the same for both groups represented by \(d_1\). The model assumes structural stability by restricting both groups in \(d_1\) to have the same slope on \(x_2\).

Note that the marginal effect of \(x_2\) is \(\beta_2\) for both groups represented by \(d_1\).

We relax structural stability by adding a multiplicative interaction between \(d_1\) and \(x_2\). This allows the slope on \(x_2\) to differ between the two groups represented by \(d_1\). Note that whether the two groups have different slopes is now a matter of hypothesis testing.

Multiplicative Interactions

In this regression:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 + \beta_3 (d_1x_2)\]

The slope for \(x_2\) is \(\beta_2\) when \(d_1=0\) and \(\beta_2+\beta_3\) when \(d_1=1\). The interaction term, \(\beta_3\), measures the difference in slopes between the two groups represented by \(d_1\).

Note that interpretation of \(x_2\) is now conditional on \(d_1\) - we cannot interpret the effect of \(x_2\) without reference to \(d_1\).
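One way to see this concretely (a minimal sketch on simulated data, not the ITT model): the slopes implied by the interaction model match separate regressions on the two subsamples exactly.

```r
# slopes from the interaction model equal the subsample slopes
set.seed(7)
d  <- rbinom(1000, 1, 0.5)
x2 <- rnorm(1000)
y  <- 1 + 2 * d + 0.5 * x2 + 1.2 * d * x2 + rnorm(1000)

mi <- lm(y ~ d + x2 + d:x2)                            # interaction model
m0 <- lm(y ~ x2, data = data.frame(y, x2)[d == 0, ])   # d = 0 subsample
m1 <- lm(y ~ x2, data = data.frame(y, x2)[d == 1, ])   # d = 1 subsample

all.equal(unname(coef(mi)["x2"]), unname(coef(m0)["x2"]))                     # TRUE
all.equal(unname(coef(mi)["x2"] + coef(mi)["d:x2"]), unname(coef(m1)["x2"]))  # TRUE
```

Because the interaction model is fully saturated in \(d\), it reproduces the subsample slopes: \(\beta_2\) for \(d_1=0\) and \(\beta_2+\beta_3\) for \(d_1=1\).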

Estimating the model

Brambor, Clark, and Golder (2006) provide a great discussion of multiplicative interactions in regression models. Berry, Golder, and Milton (2012) provide further developments and particularly focus on hypotheses to test from interaction models. Recently, Clark and Golder (2023) present a comprehensive treatment of interactions in regression models (along with complete replication code). Resources for the book and articles are available on Matt Golder’s website.

Basics of interactions

A regression with the multiplicative interaction of two variables, \(x_1\) and \(x_2\):

  • includes \(x_1\) and \(x_2\); these are called constituent terms of the interaction. Always include the constituents.

  • includes a new variable \(x_1 * x_2\); this is the interaction term.

  • includes whatever other variables per usual.

  • interpretation is always conditional, so we cannot interpret the effects of the constituents or interaction term without reference to the other.

Suppose the following model:

\[y = \beta_0 + \beta_1 d_1 + \beta_2 x_2 + \beta_3 (d_1x_2)\]

The expected value of \(y\) for the case where \(d_1=0\) is:

\[E[y|x_2, d_1=0] = \beta_0 + \beta_2 x_2\]

Where \(d_1=1\), the expected value of \(y\) is:

\[E[y|x_2, d_1=1] = \beta_0+\beta_1 + (\beta_2 +\beta_3) x_2 \]

Here, the expected value of \(y\) depends on the intercepts given by \(\beta_0\) and \(\beta_1\), and on the slope on \(x_2\), which differs depending on the value of \(d_1\).

Marginal Effects

An important insight: the interaction coefficient, \(\beta_3\), measures the difference in slopes between the groups represented by \(d=0\) and \(d=1\).

This means the marginal effect of a change in \(x_2\) is now \(\beta_2+\beta_3 d_1\), and, symmetrically, the marginal effect of a change in \(d_1\) is \(\beta_1+\beta_3 x_2\).

Example

Let’s look at the scarring torture model interacting protests and restricted access. The model looks like this:

\[scarring = \beta_0 + \beta_1 protests + \beta_2 restricted + \beta_3 (protests*restricted) + X\beta\] where \(X\) is a matrix of control variables.

code
#  continuous/binary interaction ----
# now, interact protests and restricted access
itt <- read.csv("/Users/dave/Documents/teaching/501/2023/exercises/ex4/ITT/data/ITT.csv")

itt$p1 <- itt$polity2+11
itt$p2 <- itt$p1^2
itt$p3 <- itt$p1^3

itt <- 
  itt%>% 
  group_by(ccode) %>%
  mutate( lagprotest= lag(protest), lagRA=lag(RstrctAccess), n=1) %>%
  mutate(gdp=wdi_gdpc /1000) %>%
  mutate(pop=wdi_pop/100000) %>%
  mutate(interaction=lagprotest*RstrctAccess) %>%
  ungroup()

m4 <- lm(scarring ~  lagprotest + as.factor(RstrctAccess) +interaction +civilwar  + p1 +gdp+pop  , data=itt)

stargazer(m4, type="html",  single.row=TRUE, header=FALSE, digits=3,  omit.stat=c("LL","ser"),  star.cutoffs=c(0.05,0.01,0.001),  column.labels=c("OLS Estimates"),  dep.var.caption="Scarring Torture", dep.var.labels.include=FALSE,  covariate.labels=c("Protests, t-1", "Restricted Access","Protests (t-1)*Restricted Acc.",  "Civil War",  "Polity", "GDP per capita", "Population"),  notes=c("Standard errors in parentheses", "Significance levels:  *** p<0.001, ** p<0.01, * p<0.05"), notes.append = FALSE,  align=TRUE,  font.size="small")
Scarring Torture
OLS Estimates
Protests, t-1 0.136 (0.074)
Restricted Access 8.494*** (1.241)
Protests (t-1)*Restricted Acc. 0.783*** (0.230)
Civil War 7.368*** (1.549)
Polity 0.036 (0.052)
GDP per capita 0.009 (0.031)
Population 0.001** (0.0002)
Constant 4.594*** (0.769)
Observations 971
R2 0.213
Adjusted R2 0.207
F Statistic 37.182*** (df = 7; 963)
Note: Standard errors in parentheses
Significance levels: *** p<0.001, ** p<0.01, * p<0.05

Let’s look at the estimates and think about what we can and cannot say. Recall the estimates on protests and restricted access are each conditional on the other, so we cannot make unconditional statements. We can say the effect of protests on scarring torture is about 0.14 when access is unrestricted, and that restricting access increases scarring torture by about 8.5 reports when there are zero protests.

There is no statistically discernible effect of protests on scarring torture when restricted access is zero. When access is restricted, the effect of protests on scarring torture is \(\beta_1+\beta_3 d_1\) (\(.14+.78 \times 1 = .92\)): the sum of the baseline slope for protests and the interaction slope, which represents the difference in the slope on protests between restricted and unrestricted access. This is the marginal effect of a change in protests on scarring torture given restricted access.

Inference

To say whether this is different from zero, we need to compute a standard error for the sum of these two coefficients. We can’t just add the two standard errors together, but constructing one from the variance-covariance matrix of \(\beta\) is easy:

\[ se(\hat\beta_{1}+ \hat\beta_{3}X_{2})= \sqrt{var(\hat\beta_{1})+ X_{2}^{2}var(\hat\beta_{3})+2X_{2}cov(\hat\beta_{1},\hat\beta_{3})} \] This is how we’d compute the standard error for the sum of the two coefficients. Golder and his colleagues provide guidance for computing standard errors for a variety of models, including two-way interactions (as above), three-way interactions, and models with quadratic terms - see the two figures below, both from Golder’s website.
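A quick check of this computation on simulated data (a sketch, not the ITT model): build the standard error directly from `vcov()`, and note that at \(X_2=0\) it collapses to the reported standard error on \(d\).

```r
# computing se(b_d + b_interaction * X2) from the variance-covariance matrix
set.seed(21)
d  <- rbinom(800, 1, 0.5)
x2 <- rnorm(800)
y  <- 1 + 0.8 * d + 0.5 * x2 + 1.5 * d * x2 + rnorm(800)

m <- lm(y ~ d + x2 + d:x2)
V <- vcov(m)   # variance-covariance matrix of the coefficients

se_sum <- function(X2) {
  sqrt(V["d", "d"] + X2^2 * V["d:x2", "d:x2"] + 2 * X2 * V["d", "d:x2"])
}

# at X2 = 0 the formula collapses to the reported se of the d coefficient
all.equal(se_sum(0), sqrt(V["d", "d"]))   # TRUE
se_sum(1)   # se of the effect of d when x2 = 1
```

The same three terms - two variances and one covariance - appear no matter which conditional effect you are testing; only the weighting by \(X_2\) changes.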

[Figures from Golder’s website: interaction standard errors; quadratic interaction standard errors]

Quantities of Interest

  • linear predictions - computed as usual (using any of the methods we’ve learned, e.g., at-mean effects, average effects, simulated effects), this is a principal quantity of interest.

  • We might also consider the marginal effects of changes in one variable on the expected value of \(y\). The marginal effect is simply the change in \(y\) given a change in \(x\).

In the non-interactive model, the marginal effect of \(x_2\) is:

\[\frac{\partial{y}}{\partial{x_{2}}}= \beta_{2}\]

In the interactive model, referring to the torture model above, the marginal effect of a change in \(d_1\) (access) is:

\[\frac{\partial{y}}{\partial{d_1}}= \beta_2+\beta_3*x_1 \]

Note this depends on the value of \(x_1\), protests, allowing the effect of restricting access to grow or shrink as protests increase. We would say this is the marginal effect of a change in access on scarring torture, conditional on the level of protests.

Interactions in regression models are symmetric in the sense that we can be interested in the effect of \(d_1\) on \(y\) given \(x_2\) or the effect of \(x_2\) on \(y\) given \(d_1\). Which you’re interested in is up to you and is a matter of theory.

The (symmetric) marginal effect of \(x_1\), protests, on \(y\) given \(d_1\) (access) is:

\[\frac{\partial{y}}{\partial{x_1}}= \beta_1+\beta_3*d_1 \]

It should make sense that if \(d_1=0\), the effect is just \(\beta_1\). If \(d_1=1\), we adjust the slope by \(\beta_3\), thereby allowing the slopes to be different for the two groups in \(d_1\), relaxing structural stability. This is also a reminder that we can only interpret these coefficients conditional on one another.

Linear Predictions (Average Effects)

Here are predictions computed as average effects for the scarring torture model. Note we have predictions for two groups (restricted and unrestricted access), and their slopes are different.

code
# estimation sample
itt$used <- TRUE
itt$used[na.action(m4)] <- FALSE
ittesample <- itt %>%  filter(used)


# loop over number of protests
pred_data <-ittesample
protests <-0
medxbr0 <-0
ubxbr0 <-0
lbxbr0 <-0
medxbr1 <-0
ubxbr1 <-0
lbxbr1 <-0
pred_data$RstrctAccess <- 0
for(p in seq(1,40,1)) {
  pred_data$lagprotest <- p 
  pred_data$interaction <- 0
  protests[p] <- p 
  allpreds <- data.frame(predict(m4, interval="confidence", se.fit=TRUE, newdata = pred_data))  
  medxbr0[p] <- median(allpreds$fit.fit, na.rm=TRUE)
  ubxbr0[p] <- median(allpreds$fit.fit, na.rm=TRUE)+1.96*(median(allpreds$se.fit, na.rm=TRUE))
  lbxbr0[p] <- median(allpreds$fit.fit, na.rm=TRUE)-1.96*(median(allpreds$se.fit, na.rm=TRUE))
}
pred_data$RstrctAccess <- 1
for(p in seq(1,40,1))  {
  pred_data$lagprotest <- p
  pred_data$interaction <- p
  allpreds <- data.frame(predict(m4, interval="confidence", se.fit=TRUE, newdata = pred_data))  
  medxbr1[p] <- median(allpreds$fit.fit, na.rm=TRUE)
  ubxbr1[p] <- median(allpreds$fit.fit, na.rm=TRUE)+1.96*(median(allpreds$se.fit, na.rm=TRUE))
  lbxbr1[p] <- median(allpreds$fit.fit, na.rm=TRUE)-1.96*(median(allpreds$se.fit, na.rm=TRUE))
}

df <- data.frame(medxbr0, ubxbr0,lbxbr0,medxbr1, ubxbr1, lbxbr1, protests)


#plotting

ggplot() +
  geom_ribbon(data=df, aes(x=protests, ymin=lbxbr0, ymax=ubxbr0),fill = "grey70", alpha = .4, ) +
  geom_ribbon(data=df, aes(x=protests, ymin=lbxbr1, ymax=ubxbr1), fill= "grey60",  alpha = .4, ) +
  geom_line(data= df, aes(x=protests, y=medxbr0))+
  geom_line(data= df, aes(x=protests, y=medxbr1))+
  labs ( colour = NULL, x = "Protests Against Government", y =  "Expected Scarring Torture Reports" ) +
  annotate("text", x = 8, y = 30, label = "Restricted Access", size=3.5, colour="gray30")+
  annotate("text", x = 8, y = 11, label = "Unrestricted Access", size=3.5, colour="gray30")

As protests increase, we see little change in scarring torture where IGO access is unrestricted. Transparency makes repressive acts harder to execute. On the other hand, where IGO access is restricted, scarring torture increases at a considerably faster rate as protests increase. The confidence bands do not overlap, suggesting the slopes are different for the two groups.

Marginal Effects

We can also compute the marginal effects (as above). Let’s look at the marginal effect of restricting access to IGOs on scarring torture as protests increase.2

In the regression above, R stores the model coefficients as a vector: the constant is first, the coefficient on protests second, the coefficient on restricted access third, and the coefficient on the interaction term fourth. In terms of those positions, the marginal effect of restricting access on scarring torture across protests would be:

\[ \frac{\partial{y}}{\partial{d_1}}= \beta_3+\beta_4 \times protests \]

or the coefficient on restricted access plus the coefficient on the interaction term times the value of protests.

code
df <- df %>% mutate(protest=seq(1,40,1),
  MEra=m4$coefficients[3]+m4$coefficients[4]*protest, 
  se = sqrt(vcov(m4)[3,3]+vcov(m4)[4,4]*protest^2+2*vcov(m4)[3,4]*protest),   
  ub=MEra+1.96*se, lb=MEra-1.96*se)

ggplot() +
  geom_line(data=df, aes(x=protest, y=MEra) ) +
  geom_ribbon(data=df, aes(x=protest, ymin=lb, ymax=ub),fill = "grey70", alpha = .4, ) + labs( x = "Protests", y =  "Marginal Effect of Access Restriction on Scarring Torture")

In this plot, the marginal effect of restricting IGO access over the range of protests is positive, and increasing - it is above zero, and it has a positive slope. These are not redundant statements. The marginal effect could be negative but increasing, or positive but decreasing. It’s important to identify both the location (above or below zero, or not different from zero) and the slope (positive, negative, zero).

One last observation (shown in the simulated data earlier) is that the marginal effect is the difference between the two lines in the linear prediction plot above. The marginal effect is the difference between the two groups (restricted, unrestricted) over the values of protests.
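A final simulated sketch of that equivalence (illustrative names only): the gap between the two prediction lines at any value of \(x\) is exactly \(\hat\beta_{d}+\hat\beta_{d \times x}\,x\).

```r
# the gap between the two prediction lines IS the marginal effect of d
set.seed(5)
d <- rbinom(600, 1, 0.5)
x <- rnorm(600)
y <- 2 + 1 * d + 0.4 * x + 0.9 * d * x + rnorm(600)
m <- lm(y ~ d + x + d:x)

xgrid <- seq(-2, 2, 0.5)
p1 <- predict(m, newdata = data.frame(d = 1, x = xgrid))   # restricted-style line
p0 <- predict(m, newdata = data.frame(d = 0, x = xgrid))   # unrestricted-style line

gap <- unname(p1 - p0)                          # vertical distance between lines
me  <- unname(coef(m)["d"] + coef(m)["d:x"] * xgrid)   # marginal effect of d
all.equal(gap, me)   # TRUE
```

So plotting the marginal effect of \(d\) across \(x\) is just another view of the distance between the two prediction lines.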


References

Berry, William D., Matt Golder, and Daniel Milton. 2012. “Improving Tests of Theories Positing Interaction.” Journal of Politics 74 (3): 653–71.
Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 14 (1): 63–82.
Clark, William Roberts, and Matt Golder. 2023. Interaction Models: Specification and Interpretation. Cambridge University Press.

Footnotes

  1. It should be evident my argument implies both that protests influence scarring torture and that torture shapes protests; this simultaneity, if it exists, means the estimates are both biased and inefficient, and we should find a way to model this interesting source of endogeneity. We’ll do so in week 15.↩︎

  2. Since restricted access is binary, this is really a first difference rather than a marginal effect.↩︎